[SPARK-7689] Remove TTL-based metadata cleaning in Spark 2.0 #10534
Conversation
Good to remove it :). It has been obsolete for a long time, and will potentially lead to some unexpected behaviors especially in Streaming (like block not found).
Test build #48528 has finished for PR 10534 at commit
LGTM. We should let @tdas take a look at this though.
@@ -81,6 +81,7 @@ class StreamingContextSuite extends SparkFunSuite with BeforeAndAfter with Timeo
   test("from conf with settings") {
     val myConf = SparkContext.updatedConf(new SparkConf(false), master, appName)
+    // TODO(josh): Update these examples to use a different configuration.
TODO.
Test build #48592 has finished for PR 10534 at commit
Test build #48590 has finished for PR 10534 at commit
Test build #48601 has finished for PR 10534 at commit
Jenkins, retest this please.
All the recent test builds failed due to the same issue.
Test build #48616 has finished for PR 10534 at commit
Jenkins, retest this please.
Test build #48645 has finished for PR 10534 at commit
+1
@@ -75,7 +77,7 @@ private[spark] class BlockManager(
   val diskBlockManager = new DiskBlockManager(this, conf)
-  private val blockInfo = new TimeStampedHashMap[BlockId, BlockInfo]
+  private val blockInfo = new ConcurrentHashMap[BlockId, BlockInfo]
Isn't it simpler to keep the Scala map interface? It will minimize changes in the rest of the code in this class.
I was worried about the pitfalls of .getOrElseUpdate not being atomic on ConcurrentHashMaps that had been wrapped into Scala maps.
Good point.
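To make the concern above concrete, here is a minimal Java sketch (not Spark's code; class and method names are hypothetical) of the same pitfall using the underlying java.util.concurrent API directly: a get-then-put "getOrElseUpdate" is check-then-act and can run the factory twice under contention, whereas computeIfAbsent performs the lookup and insert atomically.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class GetOrElseUpdatePitfall {
    static final ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
    static final AtomicInteger factoryCalls = new AtomicInteger();

    // Non-atomic check-then-act, analogous to getOrElseUpdate on a
    // ConcurrentHashMap wrapped into a Scala mutable.Map: two threads can
    // both observe "absent", both invoke the factory, and one result
    // silently overwrites the other.
    static int getOrElseUpdateRacy(String key) {
        Integer existing = map.get(key);
        if (existing != null) {
            return existing;
        }
        int created = factoryCalls.incrementAndGet(); // side effect may run twice
        map.put(key, created);
        return created;
    }

    // Atomic alternative: computeIfAbsent runs the factory at most once
    // per key, holding the bin lock for the duration of the computation.
    static int getOrElseUpdateAtomic(String key) {
        return map.computeIfAbsent(key, k -> factoryCalls.incrementAndGet());
    }
}
```

This is why operating on the ConcurrentHashMap directly (and paying the cost of touching more call sites) can be safer than keeping the Scala map interface.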
Test build #48693 has finished for PR 10534 at commit
Force-pushed from 0868fab to 8b7ca1c.
@@ -1439,7 +1428,7 @@ Apart from these, the following properties are also available, and may be useful
   <td>
     A comma separated list of ciphers. The specified ciphers must be supported by JVM.
     The reference list of protocols one can find on
-    <a href="https://blogs.oracle.com/java-platform-group/entry/diagnosing_tls_ssl_and_https">this</a>
+    a href="https://blogs.oracle.com/java-platform-group/entry/diagnosing_tls_ssl_and_https">this</a>
why is this < removed??
Whoops, thanks for catching that.
LGTM, pending tests.
Test build #48878 has finished for PR 10534 at commit
Jenkins, retest this please.
Test build #48889 has finished for PR 10534 at commit
Merging. Thanks.
Cross-references to related PRs (to aid other code archaeologists):
This PR removes `spark.cleaner.ttl` and the associated TTL-based metadata cleaning code. Now that we have the `ContextCleaner` and a timer to trigger periodic GCs, I don't think that `spark.cleaner.ttl` is necessary anymore. The TTL-based cleaning isn't enabled by default, isn't included in our end-to-end tests, and has been a source of user confusion when it is misconfigured: if the TTL is set too low, data which is still being used may be evicted or deleted, leading to hard-to-diagnose bugs. For all of these reasons, I think that we should remove this functionality in Spark 2.0. Additional benefits of doing this include marginally reduced memory usage, since we no longer need to store timestamps in hashmaps, and a handful fewer threads.